Module 1: Setting up the problem

Introduction

Geophysical surveys share a common basic framework. An energy source is delivered into the earth, which can be natural (for example, the Earth's magnetic field) or human-made (current in the ground, acoustic wave energy, etc.), and this stimulates a response according to the variation in physical properties in the subsurface. At the surface, receivers pick up the signal and record it as data.


The goal of inversion is to find a model of the physical property distribution in the earth that produced the data. This is a difficult process because (1) information about a physical property for each datum is encoded in a complex way, and (2) we have a finite amount of data and cannot represent the physical property distribution everywhere.

Inversion is a multistep process, often represented as a workflow (workflow figure to be inserted here). The goal of this module is to cover the first section of the workflow, which discretizes the problem and places the values of the functions onto a mesh. This will be done using the following steps:

(1) Start with an expression that relates a kernel function with the continuous distribution of a physical property.
(2) Discretize this expression, and introduce a simple example problem to illustrate the mathematics in detail.
(3) Define a mesh that organizes the data.
(4) Build up the matrix equation $d = Gm$.
(5) Generalize the form of the problem from the example.
(6) Implement the example problem in Python as a forward problem.

But first, here are some fundamental definitions:

The general mathematical description of the inverse problem can be written as follows:

\begin{equation} F_j[m] = d_j + n_j \quad \text{for} \quad j = 1, \ldots, N \end{equation}

where:
  • $\bf{F}$ is a forward modeling operator. $\bf{F}$ incorporates the survey design and the physics of the problem.
  • $\bf{m}$ is a generic symbol for a physical property distribution.
  • $\bf{d}$ represents the observed data (sometimes also represented as $\bf{d^{obs}}$).
  • $\bf{n}$ is a term that represents additive noise.

This is, of course, the most general formulation of the problem. In this module we will consider the simplest case, which is (a) one dimensional (this can be likened to a survey that varies as a function of depth only) and (b) linear, in which our forward modeling operator becomes a matrix $G$ in the matrix equation:

\begin{equation}d = Gm\end{equation}

Step 1: Physical property distribution and the kernel function.

Each datum ($d_i$) collected is a volumetric response; that is to say, every datum measures the response of the whole volume (within the range of the system). It records the superposition of effects caused by all the material in the ground, and is therefore naturally represented as an integral. A "kernel function" or "sensitivity function", $g(x,y,z)$, shows how a datum is affected by the entire subsurface. It describes the physics of the problem. The model, $m(x,y,z)$, represents the distribution of a physical property in the volume. Since each datum measures the interaction of the kernel function with the physical property distribution in the volume, for a continuous medium we can express this relationship as the inner product of the kernel function and the model. In the one dimensional case the expression for the $i$th datum is written as:

\begin{equation}d_i = \int_a^b g_i(x) m(x) dx \end{equation}


where again: \begin{equation} \begin{array}{l} d := \text{measured data} \\ g := \text{kernel function} \\ m := \text{physical property} \end{array} \end{equation}

Step 2: Discretize the expression

While the integral expression describes the inner product in a continuum, it is not possible to maintain this representation in our data space, so the above equation must be discretized. Let us consider the case where the model is divided into $N$ cells, and as we build up the matrix equation, let $i$ index the rows (one per datum) and $j$ index the columns (one per model cell) of our matrix:

\begin{equation}d_i = \int_a^b g_i(x) m(x) dx \; \Rightarrow \; \sum_{j=1}^N g_i (x_j) m_j \Delta x\end{equation}
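As a concrete illustration of this discretization, here is a minimal sketch for a single datum; the kernel, spacing, and model values below are assumed purely for illustration and are not taken from the text:

import numpy as np

# Assumed setup: 5 cells of width 0.2 on [0, 1], a made-up kernel, and
# made-up model values. The integral for one datum becomes a weighted sum.
dx = 0.2
xj = np.arange(0.1, 1.0, dx)                    # cell-center locations x_j
g1 = np.exp(-xj)                                # an example kernel g_1 evaluated at x_j
m  = np.array([0.02, 0.05, 0.09, 0.07, 0.04])   # example model values m_j
d1 = np.sum(g1 * m * dx)                        # d_1 ~ sum_j g_1(x_j) m_j dx
print(d1)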


A "Toy Problem"

Consider a simple case where we have a one dimensional, linear problem that will generate two data points, $d_1$ and $d_2$, from five physical property values
($m_1, m_2, m_3, m_4, m_5$), and these two data points are generated using two kernel functions, $g_1$ and $g_2$. Further, let us assume that our domain of interest lies on the interval $[0,1]$. The equation above is then expressed as the following two equations, one for each datum:

\begin{equation}d_1 = \int_0^1 g_1(x) m(x) dx \; \Rightarrow \; \sum_{j=1}^N g_1 (x_j) m_j \Delta x\end{equation}


\begin{equation}d_2 = \int_0^1 g_2(x) m(x) dx \; \Rightarrow \; \sum_{j=1}^N g_2 (x_j) m_j \Delta x\end{equation}

Given that our problem is small by design, it is instructive to visualize the summation notation as matrix-vector products. Doing so yields the following expressions:

(a) Data. We can collect our two data points into a column vector. Our data in vector notation is given by:

\begin{equation} d = \left[ \begin{array}{c} d_1\\ d_2 \end{array} \right] \end{equation}


(b) The x-spacing. The x-spacings, $\Delta x$, are represented by a diagonal matrix. Here we are in a one dimensional case, but in general our model will reside in a volume, so let this matrix be represented as $V$. In the most general situation, the x-spacings need not be equal, and there can be significant variation in the spacings within a grid, depending on the amount of resolution one desires at a particular location. For the moment, let's ignore this complexity and assume equal spacing in our grid. Then let $V=diag(\Delta x)$, and given the dimensions of our problem, $V$ appears as follows:

\begin{equation} V=\begin{bmatrix} \Delta x &0 &0 &0 &0 \\ 0 &\Delta x &0 &0 &0 \\ 0 &0 &\Delta x &0 &0 \\ 0 &0 &0 &\Delta x &0 \\ 0 &0 &0 &0 &\Delta x\\ \end{bmatrix} \end{equation}


(c) The model. The model, $m$, is a column vector with five rows: \begin{equation} m= \left[ \begin{array}{c} m_1\\ m_2\\ m_3\\ m_4\\ m_5 \end{array} \right] \end{equation}

(d) The kernel functions. Recall that a kernel function shows how a datum is affected by the entire subsurface. The full expression for the kernel function will be developed further in the next sections, but for the moment, let us put each kernel function on a row of a matrix, such that $g_1$ will be on row 1 and $g_2$ on row 2, and define this matrix as $\widetilde{G}$. This yields the matrix: \begin{equation} \widetilde{G}= \begin{bmatrix} g_{11} &g_{12} &g_{13} &g_{14} &g_{15}\\ g_{21} &g_{22} &g_{23} &g_{24} &g_{25} \end{bmatrix} \end{equation}

Using the above, the assembled matrix-vector form of these equations becomes:

\begin{equation} \left[ \begin{array}{c} d_1\\ d_2\\ \end{array} \right] = \begin{bmatrix} g_{11} &g_{12} &g_{13} &g_{14} &g_{15}\\ g_{21} &g_{22} &g_{23} &g_{24} &g_{25} \end{bmatrix} \begin{bmatrix} \Delta x &0 &0 &0 &0 \\ 0 &\Delta x &0 &0 &0 \\ 0 &0 &\Delta x &0 &0 \\ 0 &0 &0 &\Delta x &0 \\ 0 &0 &0 &0 &\Delta x\\ \end{bmatrix} \left[ \begin{array}{c} m_1\\ m_2\\ m_3\\ m_4\\ m_5 \end{array} \right] \end{equation}

Or to put it succinctly, $d = \widetilde{G} \, diag(\Delta x)\, m$.
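For example, expanding the first row of this product recovers the discrete sum from Step 2:

\begin{equation} d_1 = \Delta x \left( g_{11} m_1 + g_{12} m_2 + g_{13} m_3 + g_{14} m_4 + g_{15} m_5 \right) = \sum_{j=1}^{5} g_{1j} m_j \Delta x \end{equation}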

Step 3: Set up the mesh

Now that the data has been discretized, let's look at where the values are to be placed on our mesh. First, subdivide the one dimensional domain into cell centers and nodes, with spacings of $\Delta x$.

For convenience (and for applications that will be discussed in later modules), the model values reside in the cell centers, while the values for the kernel functions reside on the nodes.


Note that the kernel function in our example (represented above) has six values while the model has five; moreover, the x-coordinates where $g(x)$ and $m(x)$ are evaluated are not coincident, and this leads to a complication when we want to perform the inner product of $m$ and $g$. What is needed is a way to evaluate the kernel functions at the cell centers. To do this, we employ the trapezoidal rule for approximating the integration. Simply put, to obtain the values of the kernel function at the cell centers, take the average of the kernel function values on the adjacent nodes. Let us then define two kernel function matrices: one with values evaluated at the cell centers, $G_c$ (represented by the black circles below), and another with the values evaluated on the nodes, $G_n$ (in white).

The relationship between $G_c$ and $G_n$ is an "averaging matrix", $A_v$, such that $G_c = A_v G_n$. Putting each kernel function in the columns for both $G_c$ and $G_n$, and using again the dimensions of our toy problem, this relation appears as follows:

\begin{equation} \begin{bmatrix} g_{c1}(x_1) & g_{c2}(x_1) \\ g_{c1}(x_2) & g_{c2}(x_2) \\ g_{c1}(x_3) & g_{c2}(x_3) \\ g_{c1}(x_4) & g_{c2}(x_4) \\ g_{c1}(x_5) & g_{c2}(x_5) \\ \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 1 & 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 1 & 0\\ 0 & 0 & 0 & 0 & 1 & 1\\ \end{bmatrix} \begin{bmatrix} g_{n1}(x_1) & g_{n2}(x_1) \\ g_{n1}(x_2) & g_{n2}(x_2) \\ g_{n1}(x_3) & g_{n2}(x_3) \\ g_{n1}(x_4) & g_{n2}(x_4) \\ g_{n1}(x_5) & g_{n2}(x_5) \\ g_{n1}(x_6) & g_{n2}(x_6) \\ \end{bmatrix} \end{equation}
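As an aside, this averaging matrix is simple enough to write down directly in NumPy. The snippet below is a minimal sketch (the loop-based construction used in Step 6 builds the same matrix, with the factor of 1/2 included):

import numpy as np

N = 5  # number of cell centers (model values); the mesh then has N + 1 nodes
# Each row of Av averages two adjacent nodal values, so Av is N x (N + 1)
# with a pair of 1/2 entries per row.
Av = 0.5*(np.eye(N, N+1, k=0) + np.eye(N, N+1, k=1))
print(Av)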


Meanwhile, the relationship between $G_c$ and $\widetilde{G}$ in step 2 is such that $G_c = \widetilde{G}^T$.

Step 4: Build up the matrix equation $d=Gm$

We now have all the required building blocks to assemble the matrix equation. At the end of Step 2 we arrived at $d = \widetilde{G} \, diag(\Delta x)\, m$, and from the previous step we have $G_c = \widetilde{G}^T = A_v G_n$. Rearranging, $\widetilde{G} = (A_v G_n)^T$, so substituting this into our expression gives:

\begin{equation} d = (A_v G_n)^T \, diag(\Delta x)\, m \end{equation}


If we group matrices together and let $G = (A_v G_n)^T diag(\Delta x)$ then we arrive at $d=Gm$, our desired form. Referring again to our example, this would appear as follows:

\begin{equation} \left[ \begin{array}{c} d_1\\ d_2\\ \end{array} \right] = \frac{1}{2} \left( \begin{bmatrix} 1 & 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 1 & 0\\ 0 & 0 & 0 & 0 & 1 & 1\\ \end{bmatrix} \begin{bmatrix} g_{n1}(x_1) & g_{n2}(x_1) \\ g_{n1}(x_2) & g_{n2}(x_2) \\ g_{n1}(x_3) & g_{n2}(x_3) \\ g_{n1}(x_4) & g_{n2}(x_4) \\ g_{n1}(x_5) & g_{n2}(x_5) \\ g_{n1}(x_6) & g_{n2}(x_6) \\ \end{bmatrix} \right)^T \begin{bmatrix} \Delta x &0 &0 &0 &0 \\ 0 &\Delta x &0 &0 &0 \\ 0 &0 &\Delta x &0 &0 \\ 0 &0 &0 &\Delta x &0 \\ 0 &0 &0 &0 &\Delta x\\ \end{bmatrix} \left[ \begin{array}{c} m_1\\ m_2\\ m_3\\ m_4\\ m_5 \end{array} \right] \end{equation}


Step 5: Generalize the form of the problem

In the above case we had two data points and five model values. We can generalize this to larger data sets easily. In the case where we have $M$ measured data and $N$ model values, we obtain a matrix equation of the following dimensions:

\begin{equation} \underbrace{d}_{M \times 1} = \underbrace{\left[ A_v G_n \right]^T}_{\left[ (N \times (N+1)) \; ((N+1) \times M) \right]^T} \; \underbrace{diag(\Delta x)}_{N \times N} \; \underbrace{m}_{N \times 1} \end{equation}
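To make this generalization concrete, here is a minimal sketch of a helper that assembles $G$ for an arbitrary number of kernels and cells; the function build_G and its arguments are hypothetical and not part of SimPEG:

import numpy as np

def build_G(kernels, x_nodes):
    # Assemble G = (Av Gn)^T diag(dx) on a 1D mesh (illustrative sketch only).
    # kernels : list of M callables, each evaluating a kernel g_j(x) on an array
    # x_nodes : array of N + 1 node locations (spacing need not be uniform)
    N = len(x_nodes) - 1
    dx = np.diff(x_nodes)                                  # N cell widths
    Av = 0.5*(np.eye(N, N+1) + np.eye(N, N+1, k=1))        # node-to-center averaging
    Gn = np.column_stack([g(x_nodes) for g in kernels])    # (N + 1) x M nodal kernel values
    return np.dot(np.dot(Av, Gn).T, np.diag(dx))           # M x N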

Step 6: Implement the example problem in Python.

Next, we will build up our matrices one by one in Python. First import the SimPEG and numpy packages:


In [2]:
from SimPEG import *
from IPython.html.widgets import interactive
import numpy as np

Let's start by formulating the forward problem, that is, let us assume that we have our model values $m$ and seek to generate synthetic data $d$. Recall that we are building up the matrix equation $d = (A_v G_n)^T \, diag(\Delta x)\, m$, so let's start with the right-hand side and go over each matrix, one by one.

(1) The $A_v$ matrix. We can build the "averaging matrix" as follows:


In [6]:
N = 5            # Recall N is the number of model values in m
n = N+1          # n is the number of nodes, N+1
s = (N,n)        # Dimensions of the averaging matrix
Av = np.zeros(s) # Create a matrix of zeros of the correct dimensions
                 # and fill in its elements using the loop below (note the 1/2 is included here)
for i in range(N):
    Av[i,i]   = 0.5
    Av[i,i+1] = 0.5
print(Av)


[[ 0.5  0.5  0.   0.   0.   0. ]
 [ 0.   0.5  0.5  0.   0.   0. ]
 [ 0.   0.   0.5  0.5  0.   0. ]
 [ 0.   0.   0.   0.5  0.5  0. ]
 [ 0.   0.   0.   0.   0.5  0.5]]

(2) The kernel matrix on the nodes, $G_n$. To build this matrix, we must first define a sensitivity function that describes some physical phenomenon. For the sake of our example, even though it has no direct physical meaning, let us define the kernel function for the toy problem as follows:

\begin{equation} g_j(x) = e^{-jpx} \cos(2 \pi j q x) \end{equation}


In [26]:
g = lambda x, k, p, q: np.exp(-p*k*x)*np.cos(2*np.pi*q*k*x)   # anonymous function implementing the kernel defined above
x = np.linspace(0, 1, 6)                                      # define the nodes of our x-array
#x = np.array([0., 0.2, 0.4, 0.6, 0.8, 1.])
p = 0.01                                                      # set values for p, q, j
q = 0.1
j = np.array([1, 2])
Gn = np.zeros((len(x), len(j)))                               # preallocate a matrix Gn and evaluate the kernel functions in the loop below

for i in range(len(j)):
    f = g(x, j[i], p, q)
    Gn[:,i] = f
print(Gn)


[[ 1.          1.        ]
 [ 0.99013245  0.96471657]
 [ 0.96471657  0.86932419]
 [ 0.92421453  0.72027328]
 [ 0.86932419  0.52732179]
 [ 0.80096714  0.30289805]]

(3) The volume matrix $V$. This simply consists of making an $N \times N$ diagonal matrix of x-spacings, $\Delta x$.


In [13]:
# make the delta x array

Deltax = 0.2*np.ones(N) # set x-spacings
V = np.diag(Deltax)     # create diagonal matrix
print(V)


[[ 0.2  0.   0.   0.   0. ]
 [ 0.   0.2  0.   0.   0. ]
 [ 0.   0.   0.2  0.   0. ]
 [ 0.   0.   0.   0.2  0. ]
 [ 0.   0.   0.   0.   0.2]]

(4) Input the model values, $m$. Given that we are solving a forward problem, we assume these values as given, so we input fictitious values:


In [15]:
m = np.array([0.02, 0.05, 0.09, 0.07, 0.04])
print(m)


[ 0.02  0.05  0.09  0.07  0.04]

(5) Generate our two data values. The remaining step is simply to perform the matrix-vector multiplication:


In [24]:
d = np.dot(np.dot(np.transpose(np.dot(Av, Gn)), V), m)
print(d)


[ 0.04999083  0.03946006]
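As a quick cross-check, the hypothetical build_G sketch from Step 5 should reproduce the same two data values when given the same kernels, nodes, and model (p, q, and m as defined in the cells above):

x_nodes = np.linspace(0, 1, 6)
kernels = [lambda x, k=k: np.exp(-p*k*x)*np.cos(2*np.pi*q*k*x) for k in (1, 2)]
G = build_G(kernels, x_nodes)
print(np.dot(G, m))   # should agree with d above, approximately [ 0.04999  0.03946]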

In [45]:
# Compute the cell-center locations from the nodes as follows

x = np.array([0., 0.2, 0.4, 0.6, 0.8, 1.])
xc = 0.5*(x[1:] + x[0:-1])
print(x)
print(xc)


[ 0.   0.2  0.4  0.6  0.8  1. ]
[ 0.1  0.3  0.5  0.7  0.9]
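Optionally, the node and cell-center layout described in Step 3 can be visualized; the sketch below assumes Av, Gn, x, and xc from the cells above and uses matplotlib:

import matplotlib.pyplot as plt

Gc = np.dot(Av, Gn)   # kernel values averaged onto the cell centers
for col, label in zip(range(Gn.shape[1]), ('$g_1$', '$g_2$')):
    plt.plot(x, Gn[:, col], 'o-', label=label + ' on nodes')
    plt.plot(xc, Gc[:, col], 'ks', label=label + ' on cell centers')
plt.xlabel('x')
plt.legend()
plt.show()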

For reference, here is the matrix equation from Step 4 with the averaging matrix multiplied through explicitly:

\begin{equation} \left[ \begin{array}{c} d_1\\ d_2\\ \end{array} \right] = \frac{1}{2} \left( \begin{bmatrix} g_{n1}(x_1) + g_{n1}(x_2) & g_{n2}(x_1) + g_{n2}(x_2) \\ g_{n1}(x_2) + g_{n1}(x_3) & g_{n2}(x_2) + g_{n2}(x_3) \\ g_{n1}(x_3) + g_{n1}(x_4) & g_{n2}(x_3) + g_{n2}(x_4) \\ g_{n1}(x_4) + g_{n1}(x_5) & g_{n2}(x_4) + g_{n2}(x_5) \\ g_{n1}(x_5) + g_{n1}(x_6) & g_{n2}(x_5) + g_{n2}(x_6) \\ \end{bmatrix} \right)^T \begin{bmatrix} \Delta x &0 &0 &0 &0 \\ 0 &\Delta x &0 &0 &0 \\ 0 &0 &\Delta x &0 &0 \\ 0 &0 &0 &\Delta x &0 \\ 0 &0 &0 &0 &\Delta x\\ \end{bmatrix} \left[ \begin{array}{c} m_1\\ m_2\\ m_3\\ m_4\\ m_5 \end{array} \right] \end{equation}